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Abstract. We analyse and drastically improve the running time of the al- 
gorithm of Mazur, Stein and Tate for computing the canonical cyclotomic 
p-adic height of a point on an elliptic curve E/Q, where E has good ordinary 
reduction at p > 5. 



1. Introduction 

Mazur, Stein and Tate [MST06| recently introduced a fast algorithm for comput- 
ing the canonical, cyclotomic p-adic height hp{P) of a rational point P on an elliptic 
curve E/Q — in this paper, referred to simply as the p-adic height — in the case 
that E has good ordinary reduction at p > 5. Their algorithm finds applications 
in, for example, numerical investigation of phenomena related to the p-adic Birch 
and Swinnerton-Dyer conjectures, since the p-adic height is part of the definition 
of the p-adic regulator term. 

|MST06j did not give a bound for the running time of their algorithm; rather, 
the point of their paper is that the p-adic height can be computed feasibly at all, 
to reasonably high p-adic precision. In this paper we sketch an estimate for the 
running time of the algorithm of [MST06j . and we give several improvements that 
drastically improve the asymptotic running time, both for large p and for high 
precision. With a contemporary desktop computer, it becomes straightforward to 
calculate a handful of p-adic digits of the height when p is around 10^^; and in the 
other direction, for small primes, one easily obtains thousands of p-adic digits. 

We also carefully analyse the amount of p-adic precision that must be maintained 
throughout the calculation to guarantee a given number of correct digits of output; 
|MST06| did not discuss this issue in much detail. Obtaining a sharp bound is 
particularly important when p is large, since computing additional unnecessary 
digits in intermediate calculations becomes very expensive. 

In our running time estimates, we will generally ignore logarithmic factors, 
expressing all estimates in terms of the 'soft-oh' notation 0{X), which means 
0{X{logX)^) for some integer fc > 0. 

To state the main results, we introduce some notation. The elliptic curve E/Q 
of interest is given by the Weierstrass equation 

(1) + aixy + a^y = + a2x'^ + a4X + ag, 

with the usual invariant differential lu — dx/{2y + aix + 03). We assume that the 
ak are integers, but we do not assume that the equation is minimal. We assume 
that p > 5 and that E has good ordinary reduction at p. We treat E as fixed for 
the purposes of estimating running times. 
The following result is proved in 
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Theorem 1. Let N > I. The value of the p-adic modular form E2(-B, a;) may 
be computed mod p^ in time 0{pN^). If p > 6N, it may be computed in time 

The next result, proved in ^ concerns the p-adic sigma function (Tp(<) G Zp[[t]], 
as defined in [MT91| . This is a power series 

ap{t) ^t + C2t^ + C3t^ + ■■■ , 

where t — —x/y is a parameter for the formal group of E. For any r > 1, let be 
the ideal of Zp[[t]] given by 

/,, = (/,/-H,...,pr-\r). 

Theorem 2. Let N > 4. Given the value o/E2(-E,ijj) modp^^^, the p-adic sigma 
function may be computed mod In in time 0{N^logp). 

Computing ap{t) mod In simply means that we are computing the coefficient 
of t'' modulo p^"*"', for each k = 1, . . . , N — 1. The running time is optimal up to 
logarithmic factors, since the amount of data needed to represent crp(t) mod In is 
proportional to iV^logp. Note that for < 3, computing ap{t) mod In is trivial, 
since C2 is given simply by ai/2 (see f|4|). 

Finally, we consider the problem of computing the p-adic height hp{P) mod p^ 
of a point P € E{Q) for some M > 2, given ap{t) as input. Following a suggestion 
of Christian Wuthrich, in this paper we normalise the p-adic height differently from 
[MST06j . our height is equal to the [MST06| height multiplied by the factor 2p. 

Let ni = 4l=E(Fp), let n2 be the least common multiple of the Tamagawa numbers 
of E, and let n = LCM(ni,n2). Put 

M' = M + 2vp{n), 

where Vp denotes the usual additive p-adic valuation. Note that Vp{ni) < 1, and 
we are assuming that 712 = 0(1) since E is fixed, so M' = AI + 0(1). 

The following result is proved in ijsl We denote by Cp the amount of data 
required to represent the coordinates of Cp. 

Theorem 3. Let P G E{Q,), and suppose that (7p(t) is known mod Im'+i- Then 
hp{P) may be computed mod p*^ in time 0{Cp + 7\f log^p + logp). 

Organisation of the paper. In |2]we outline the Mazur-Stein-Tate algorithm, 
and indicate the steps in their algorithm that we intend to accelerate. In iJ31 gland 
fj5] we prove Theorems [l] [2] and [3] respectively, and give several detailed examples 
of the various components of the algorithm. In f}6]we give a higher- level example of 
the whole algorithm in operation. Finally in fj7]we give some examples of timings 
for an implementation of the algorithm. 

Acknowledgements. I would like to thank first of all William Stein for introduc- 
ing me to the problem of computing p-adic heights, and for making many helpful 
suggestions on an early version of this paper. At the MSRI workshop "Computing 
with Modular Forms" (August 2006), I benefited greatly from working with Robert 
Bradshaw, Jennifer Balakrishnan and Liang Xiao on an implementation of the orig- 
inal Mazur-Stein-Tate algorithm in the computer algebra system SAGE |SJ05| . A 
discussion with Christian Wuthrich regarding portions of his thesis helped cement 
the ideas for §5.11 Thanks also to Kiran Kedlaya for pointing out the method 
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described in |Berj for computing p-adic logarithms, and for many interesting con- 
versations about his point-counting algorithm. 

2. A SKETCH OF THE MAZUR-StEIN-TATE ALGORITHM 

In this section we sketch the original algorithm of [MST06| . and indicate the 
bottlenecks in the time complexity that this paper seeks to address. 

The key insight of |MST06| is that it is possible to compute E2{E,uj) efficiently 
using Kedlaya's algorithm [KedOlj . and then to use the value of E2{E, uj) to deduce 
the p-adic sigma function. The p-adic height of a point P G E{Q) is then given 
in terms of <yp(t) by a simple formula. Their method for computing E2(i?,w) is 
discussed in the proof of Theorem [T] in fj3l We now consider the other steps. 

2.1. Computing the p-adic sigma function. The algorithm uses the fact that 
(jp(t) is the unique odd function in Zp[[t]] of the form (7p{t) — t + ■ ■ ■ satisfying the 
differential equation 

2 x[t) + c= — [ , 

UU \(J UJ J 

where c is the constant 

_ a\ +4a2 - ¥.2{E,uj) 



and where x{t) G Zp[[f]] is the power series expansion of x at the origin. Since c is 
known from E2(-E, w), it becomes a matter of computing and then solving ([2]) 
for ap{t). 

The series x{t) is determined from an auxiliary power series 'w{t) — X)n>o ^nt^ i 
using a certain recursive formula to compute the Sn- A bottleneck arises here, in 
that the recursive formula for s„ involves a term of the form X)i+j+fe=n ^i^j^k, 
which requires 0{v?) ring operations to evaluate. To compute the p-adic height to 
precision p^, it turns out to be necessary to compute 0{N) coefficients, to precision 
about each. Therefore computing x{t) to the desired precision requires 0{N^) 
ring operations, for a time cost proportional to at least N^\og{p^) — N*logp. In 
21 we will give a different method for computing x{t) whose running time is only 
d{N^ logp). 

To solve ([U, they first use the formal logarithm to to change variables from 
t to the parameter z on the additive group. After making this substitution, the 
differential equation takes the simpler form 

{ ) + -■ d / 1 da\ 
dz \a{z) dz J ' 

which can be solved by integration and exponentation of power series. A bottleneck 
arises here also: to perform the change of variables, it is necessary to invoke several 
power series composition and reversion operations. To the author's knowledge, the 
Brent -Kung algorithm [BK78j is the best algorithm available for computing series 
reversions and compositions. It has complexity 0{'n?^'^) in the number of terms, so 
when working to precision p^ the running time would be proportional to at least 
jV^/^logp. In [21 we show that it is entirely unnecessary to change variables; the 
original equation Q can be solved directly in time 0{N^ ^ogp). 
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2.2. Computing the p-adic height. Let P E i?(Q) be a nonzero point. Its affine 
coordinates may be written uniquely in the form 



where { a{P),d{P )) = {(3{P),d{P)) = 1 and d{P) > 1. 

First |MST06| consider the case that P satisfies two conditions: 

(Al) P reduces to the zero element of E{Fp); and 

(A2) P reduces to a nonsingular point of E{F£) for all primes ( at which E has 
bad reduction. 

Condition (Al) implies that t{P) = —x{P)/y{P) is divisible by p. Under these 
conditions, hp{P) is given by the formula 



where logp is the Iwasawa p-adic logarithm, and where (t(P) = (T{t{P)). (Recall 
that our normalisation for hp{P) differs from that of [MST06| by a factor of 2p.) 

To handle arbitrary nonzero Q £ E{Q), they proceed as follows. As in Theorem 
131 let ni = #i?(Fp), let n2 be the least common multiple of the Tamagawa numbers 
of E, and let n = LCM(ni, 712). Then P — nQ does satisfy (Al) and (A2), so one 
may use ^ to compute hp{P). From this one obtains hp{Q) = hp{P)/n'^, since hp 
is a quadratic form. 

The above procedure has a serious bottleneck when p is large. The difficulty is 
that one must actually compute the coordinates of nQ, which will generally have 
about digits (assuming that Q is non-torsion). From the Weil bound we have 
^E(Fp) w p, so n is roughly proportional to p. Therefore the time complexity 
is at least proportional to p^ , and for sufficiently large p will dwarf even the time 
required to compute E2{E,lu). In fJS] we will show how to avoid this problem by 
judicious use of the division polynomials associated to E, achieving a running time 
only poly logarithmic in p. 

3. Computing E2{E,uj) 

In this section we prove Theorem [T] given iV > 1, we wish to compute F,2{E, oj) 
to precision p^ . 

The first step is to compute the matrix F of the absolute p-th power Frobenius on 
the basis {dx/y,xdx/y} of the Monsky-Washnitzer cohomology of E, to precision 
p^. This may be done using either Kedlaya's algorithm [KedOlj in time 0{pN^), 
or a p > 6N one may use a modification of Kedlaya's algorithm |Har07| that has 
running 

timeO(pi/2jY5/2-)^ The optimal crossover point between the two algorithms 
would depend on the implementations; one expects it to be roughly of the form 
p = C7V for some constant C. 

As explained in |MST06[ §3.2], the value of E,2iE,uj) may be deduced from the 
matrix as follows. Write 




(3) 





Then 



E2{E,uj) 



12B/D (modp^). 
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Note that D is a unit (see jMST06| §3.2]), so the resuh is correct raod . Using 
repeated squaring, computing requires 0(log N) matrix multiphcations, each 
taking time 0{N log p), so the time for this step is only 0{N log p). 

3.1. A constant factor improvement. Kedlaya's algorithm was originally de- 
signed for hyperelliptic curves, but because we are in the special setting of an 
elliptic curve, it is possible to obtain a constant factor speedup. Indeed, the trace 
of the matrix may be determined by any of various fast algorithms for counting 
^E(Fp), and the determinant is known to be p. Since Kedlaya's algorithm com- 
putes the two columns of the matrix independently, the idea is simply to compute 
only one column, and fill in the other column from knowledge of the trace and 
determinant. (Alternatively, one could use the trace and determinant as a strong 
correctness check.) This idea was communicated to the author by William Stein, 
who attributes it to Eyal Goren. 

The speedup will be at most a factor of two. In practice it will be less, since 
it only affects "Step 2" of Kedlaya's algorithm, not "Step 1" (in the language of 
[KedOli §4]). It only applies to Kedlaya's original algorithm; it does not apply to 
the algorithm of [Har07| . 

In practice some care is needed to avoid precision loss. A particularly badly 
conditioned example is given by the curve = a;'^ + 7a; + 8 at the prime p = 11, 
whose Frobenius matrix is given to precision 0(11'^) by 

/ll • 104 + 0(11^) 11-16 + 0(113)\ 
[u^ - 7 + 0(11'^) 185 + 0(113) )■ 

If the trace and determinant are known, then the second column mod 11^ only 
determines the first column mod 11^, and the first column mod 11"^ only determines 
the second column mod 11. 

This problem may be avoided by a simple change-of-basis argument, as follows. 
One first uses Kedlaya's algorithm to compute the second column only mod p. The 
running time for this step is 0{p). If the top-right entry is a unit, then no precision 
loss will arise; one simply uses Kedlaya's algorithm at full precision p^ to compute 
the second column, and this suffices to determine the first column mod p^ from 
the trace and determinant. 

In the unlucky event that the top-right entry is not a unit, one instead runs Ked- 
laya's algorithm at precision p^ with respect to the basis {dx/y, (1 -I- x)dx/y}. In 
other words, one applies Kedlaya's reduction algorithm to the image under Frobe- 
nius of the differential (1 + x)dx/y, and expresses the result as a linear combination 
of dx/y and (1 + x)dx/y. The top-right entry of the matrix on this new basis is 
then guaranteed to be a unit, so the second column is sufficient to determine the 
first column mod p^ (and the trace does not depend on the choice of basis). Con- 
jugating by the change-of-basis matrix then yields the matrix on the original basis, 
to full precision. 

4. Computing the p-adic sigma function 

This section is devoted to proving Theorem [2] We assume that A > 4 and that 
c = {a\ + 4a2 — £2)712 is known mod p^~^. We wish to compute <Jp{t) mod Im, 
in time 0{N^ logp). 
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Throughout this section wc assume that asymptoticaUy fast polynomial arith- 
metic is being used; specifically, that multiplication and division of polynomials of 
degree d over Zi/p^Ti may be accomplished in time O {dr log p). 

Let x(t) and y{t) be the power series expansions of x and y around the origin. 
Let w{t) = —l/y{t), and let 

x'it) 



2y{t) + aix{t) + as ' 



so that Lij{t)dt is the series expansion of the invariant differential lj. The first few 
terms of each series are given by 

x{t) — — ait^^ — 02 + • ■ ■ , 

y{t) = -t"^ + aii"2 + a2t^^ H , 

w[t) + ait^ + [al + a2)t^ + • • • , 

uj{t) = 1 + flii + {af + a2)t^ H . 

(See also |Sil921 Ch. IV], which discusses these expansions in some detail, using 
slightly different notation.) 

Let R = 7i/p^~'^7i. Our first task is to compute the above series over _R, with 

+ 1 terms each. 

Proposition 4. The series t^x{t), t^y{t), t~^w(t) and uj{t) may be computed up 
to 0(<^^^), with coefficients in R, in time 0{N^\ogp). 

Proof. The series w{t) satisfies the algebraic equation 

w ^ t'^ + aitw + a2t^w + a^uP' + a^tuP + agw'^, 

and so it may be solved using Newton's method. We start with the initial approx- 
imation w{t) = t^ , and then repeatedly apply the Newton iteration 

, w — — aitw — a2t^w — a^w"^ — a^tuP' — oquP 



1 — ait — a2t^ — 2031/7 — 2a4tw — Sogw^ 
t^ — a^w"^ ~ aitw'^ - 2aQW^ 
1 — ait — a2t^ — 2aj,w — 2aitw — Za^vP 

The arithmetic is performed using truncated power series in R[t\. It is straightfor- 
ward to check that the number of correct terms doubles with each iteration. Each 
iteration requires several power series multiplications and one inversion. As is typ- 
ical in applications of Newton's method to power series, the total time to obtain n 
terms is a constant multiple of the time required for a length n polynomial multi- 
plication. Therefore the total time to compute 0{N) terms of w{t) with coefficients 
in R is 0(iVlog(p^-3)) ^ d{N^\ogp). 

Given up to 0(i^+^), we may then deduce = t/w{t),y(t) = —l/w{t) 

and ijj{t) = x' {t)/{2y{t) + aix{t) + 03), also with coefficients in R, to the desired 
number of terms, in time 0{N^ logp). □ 



Next we consider the differential equation which may be rewritten as 
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It is convenient to rephrase this in terms of the unit power series 

0{t)=t-^ap{t) = ! + ■■■ e Zp[[t]]. 

To solve for ap{t) mod In it suffices to solve for 9{t) mod In-i- We have 6'{t) /9{t) 
(jp{t)/ap{t) +t^^, so 9{t) satisfies the equation 

-I / I /I 



m\\' 



(5) x(t) + c = , , , , , , I N 1 
Manipulating this equation formally, we obtain 

(6) ^^'^^^^ ^^^'[M]' 
where 

(7) h{t) = -\- {^j {x{t) + c)uj{t)dt + C 

for some constant of integration C. (The formal integration operator is assumed to 
output a series with zero constant term.) 

The constant C is determined by the condition that (Tp(t) is an odd function. 
This condition means that ap{i(t)) = —Gpit), where i{t) = —t — ait^ + ■ ■ ■ is the 
formal inverse law |Sil921 IV §1]. It implies that the coefRcient of in ap{t) is ai/2, 
so that the coefficient of t in 9{t) is ai/2. Substituting this into ^ and using the 
expansions for uj(t) and x{t) given earlier, one finds that C = ai/2. 

Proposition 5. The series h{t) may he computed, with the coejficient ofP known 
morf p^^'^^Liogp(j)J fg^ f < J < N, and with the constant term known modp^^^, 
in time 0{N^ logp). 

In the above proposition, note that logp(j) refers to the base p logarithm, not 
the p-adic logarithm. 

Proof. First we compute {x{t) + c)Lu{t). From Proposition [4] this series is known 
up to 0{t^~^) with coefficients in i? = Z/p^~^Z. To integrate it, we must check 
that the coefficient of P has valuation at least Vp{j + 1) for each j; this follows for 
example from ([5]). (This fact is the basis for the 'integrality algorithm' of [MST06| .) 
After integrating, the series 

>(t) + C)u{t)dt + y 

is known up to 0{t^), but the coefficient of P is known only mod p^^^-^pli)^ 
because of the divisions by powers of p in the integration step. Then we must 
multiply by — w(i) and subtract l/t, to obtain 

h{t) = - uj{t) (^J {x{t) + c)uj{t)dt + y 

This series is also known up to 0{t^)\ the effect of multiplying by uj{t) is to prop- 
agate the errors to higher order terms, so now the coefficient of P is known only 
mod pW-3-Liogp(j)J. 

The constant term must be determined mod p^^"^ separately; from the explicit 
series expansions given above, it is simply ai/2. □ 
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Finally we must solve ([6]) for 9{t). Formally the solution is given by 

9{t) = exp h{t)d?j . 

Computationally this is problematic because both the integration and exponenti- 
ation steps introduce denominators, and also some p-adic precision is lost. The 
following result gives an efficient means to solve equations of the form F' /F = f, 
without any denominators appearing in intermediate steps, and with good control 
over precision loss. 

Proposition 6. Let k > 1 and 1 < n < y^l"^ , and let S — Z/p'^Z. Let f e 

S[t]/{f''^^), and suppose that there exists F £ S[t]/{t"), with F{0) — 1, such that 
F'/F ~ f. Then F may be determined mod J in time 0{nk\ogp), where J is the 
ideal of S[t]/ {t") given by 

J = (/-4^p'=-v,...). 

Remark. Note that J is the ideal that captures the types of p-adic error terms that 
occur in a power series integration. Namely, if g G S[t]/{t"'~^), and if the coefficient 
of P in g is divisible by j + 1 for each j, then g has an integral in S'[t]/(t"). In 
general it is not uniquely determined, but it is determined at least mod J. 

Proof. The algorithm we describe is essentially that of Brent |Bre76| . with some 
additional analysis to track the p-adic error terms. 

We begin with an initial approximation Fo{t) = 1 G S'[i]/(i"). We will refine 
it iteratively, obtaining a sequence Fi € S[t]/{t") such that Fi — F € J + (t^ ) 
for each i > 0. After [log2 n] steps, we will have Fi — F E J as desired. Each 
step is dominated by a polynomial multiplication and a division of length 2*, so 
the total time is a constant multiple of the time required for a single polynomial 
multiplication of length n, which is 0{nlog{p'^)) — 0{nk\ogp). 

Now we explain the iterative step. Suppose that Fi — F E J + {t^ ). Since F is 
invertible, we have 

- (1 + e)F 

for some e G J + {t^'), and so 

n .F' e' e' 
Fi ■' F 1 + e 1 + e' 

Morally speaking we would like to integrate F^/Fi — / to obtain log(l + e), but 
the latter does not make sense in our ring. Instead, we write 

e'e-' 



1 + e 1 + e 

The hypothesis n < p^l"^ implies that = 0, so that e^ G J. We also have 
e' G J + {f'^^), so e'e^ G (t^'^'"^). Consequently 

FI . , . , 



Therefore Fl/Fi — f may be integrated at least up to 0{t^ ^ ); that is, we may 
compute a G G S[t]/{t"-) such that 



2 

GGe-y + J+(t2'^') 
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(The extra J term is introduced by errors in the integration.) Since G J we have 
in fact 

Now it is straightforward to define i^i+i; we simply take 

F,+i^F,{l-G). 
From the above estimates this satisfies 

F,+i eF(l+e)(l-£) + J+(i2'+') 

= F ~ e^F + J +{t^^^^) 
= F + J+if'^") 

as desired. □ 

Now we may complete the proof of Theorem [21 Let h{t) be the approximation 
to h{t) produced by Proposition [5l treated as an element of Zp[[t]], and let 



0{t) = exp h{t) d?j e Clp[[t]] 
exp ( / h{t) - h{t) dt 



Observe that 

9{t) 

By Proposition O for 1 < j < the coefficient of P in ft, — /i has valuation at least 
— 3 — [logp jj , so the coefficient of t-' in J h — h has valuation at least 

iV - 3 - Llogp(j - 1)J - [logpjj. 

Therefore the coefficient of P in exp(/ h — h) — 1 has valuation at least 

3 



iV-3- Llogp(j-l)J - [logpjj 
because Vp{j\) < j / {p— 1). Since p > 5 we have 

3+Liogp(j-i)J + LiogpjJ 



J 



p-1 



for all j > 2. (It suffices to estimate the left hand side for p ~ 5. For large enough 
j, the j/4 term dominates; one must also check a few small values of j directly.) 

This shows that the P coefficients of 6{t) and 6{t) agree mod p^~^~^ for 2 < 
j < N. This holds for j = 1 also, because of the extra condition on the constant 
term of h{t) in Proposition O In other words, we have 6 — 9 Cz In-i- In particular, 
the coefficients of 9 are integral up to 0(t^~^), and the hypotheses of Proposition 
El are satisfied with F = 9, f ^ h, k ^ N - 2 and n = N - 1. 

The output of Proposition [5] is 9 mod J. This determines 9 mod In-i, except 
for the constant term, which is known to be ai/2. 

Example 7. We will walk through the steps of computing ap{t) for the curve 

E ■.y'^ + xy + y = x^ - 460x - 3830, 

which is '26a2' in Cremona's database, and the prime p = 5. We will take A'' = 9; 
that is, we will determine the coefficients of t,P, . . . mod 5*, 5 5^ respec- 
tively. 
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First we must compute w{t) up to 0{t^'^), with coefficients mod 5^, using Propo- 
sition ID The successive approximations are 

w{t) ^^t"" + 0(t^), 

w{t) =1^ + 1"^ + t' + 2f + 1516M^ + 0{t^), 
w{t) =t^ + t^ + V' + 2t^ + 1516M^ + 14252^** + 9048i^ 
+ 9516t^° + 9477t" + 14344*^2 + 0{t^^). 
From this we obtain the other series attached to E, 

y{t) ^ -l/w{t) = -t-^ +t-^ + 1 + Ibimt + 15166t^ 

-t- 11337^3 + 6589^-* + 9397^^ + 8273t^ + 0{t^), 

x{t) = t/w{t) = -t~^ - t + 459^^ + 459i^ 

+ 4288t'* + 9036i^ + 6228*^ + 7352i^ + 0{t^), 

uj{t) = x'{t)l{2y{t) + aix{t) + a^) = I + t + + it^ + 14712i* + 12878^^ 

+ 14267t'5 + 1881t^ + 4058t^ + 2267i^ + 0(t^"). 

Now we apply Proposition [5l Let us assume that E2 = 4303 + 0(5^) has been 
computed in advance, so we have 

12 + 4-0- 4303 ^^^^ 
c = — = 7454 + 0{b^), 

and then 

{x{t) + c)Lj{t) = + 7454 + 7455i + 6996*^ + 5820i^+ 

13590i'' + 11924t^ + 15504*^ + 1081t^ + 0(t®). 

Observe that the term is zero, so the series may be integrated. Our choice of c 
ensures that the term is divisible by p = 5, so that the integral has coefficients 
in Z5: 

j [x{t) + c)Lo{t)dt + y = -t"^ + 7813 + 7454t + 11540i2+ 

2332i3 + 1455<'' + 2718i^ + 12404^*^ + 4447t^ + 13807i*^ + O(t^). 

Note that the coefficient is now only correct mod 5^ , because we lost a digit during 
the integration (in fact the correct coefficient is 12093 (mod 5^)). We obtain 

h{t) = y^- - uj{t) (^J {x{t) + c)uj{t)dt + y 

= 7813 + 359i + 4446^^ + ll97t^ + 14708^" 

+ 6580t''^ + 6770i^ + 1524i^ + 2441t^ + 0{t^). 

The preceding multiplication by w(i) caused the incorrect digit to wash through to 
the higher order terms, so now the through coefficients are only correct mod 
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5^. Also, the constant term ai/2 is only correct mod 5^, but we need to increase 
its precision to 5^, so h{t) becomes 

h{t) = 39063 + 359i + Um^ + 1197^3 + 14708^'' + 6580^^ 

+ 6770t^ + 1524:f + 2441t^ + 0{t^). 

Now we run Brent's algorithm (Proposition [S]) to solve for 9{t), with the poly- 
nomial arithmetic performed mod S''. The successive approximations are 

e{t) = i + o{t), 

e{t) = l + 39063t + Ci(t2), 

e{t) = 1 + 39063* + 68539*2 _^ 12965*^ + 0(t''), 

e{t) = 1 + 39063i + 68539^2 + 12965^^ + 30804^^^ + 14720^^ 

+ 10063^^ + 25830t^ + 0{t^). 

Finally CTp(i) is obtained as t9{t), but Theorem [2] only guarantees the result is 
correct mod /at, so we have 

ap{t) = t + (39063 + 0{b''))t^ + (6039 + 0{^^))t^ 

+ (465 + 0(55))t'* + (179 + 0(5^))^^ + (95 + 0{^^))t^ 

+ (13 + 0(52))^^ + (0 + 0(5))i« + 0(t9). 

5. Computing the p-adic height 

In this section we prove Theorem [3] Recall that ni = ^E{Fp), n2 is the least 
common multiple of the Tamagawa numbers of E, and n — LCM(rii, 71,2). We are 
given a point P G -E'(Q) as input, whose coordinates require space Cp to store. 
Given the sigma function mod /j\/'+i as input, where 

M' = M + 2vp{n), 

we wish to compute hp{P) to precision p*^, in time 0{Cp + Mlog^p + logp). 

As noted in [J2l computing riP directly is not feasible for large p. Therefore we 
take a more indirect approach. In what follows, the symbols a{P), /3{P), d{P), 
x{P), y{P) and t{P) are the same as those defined in §2.21 

First we compute Q = n2P, so that Q satisfies the condition (A2) defined in 

Now let m = n/n2. Because mQ {— nP) satisfies both conditions (Al) and 
(A2), we may express hp(raQ) using ([3]). Since hp is a quadratic form, we obtain 

To determine hp{P) mod p*^, it suffices to compute 
mod p*^ , where we have expanded (Tp(t) as 

ap{t) ^ t + 02^ + C3t^ -\ . 
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Note that a{mQ) / (3{mQ) is a unit, since both a{mQ) and l3{mQ) are relatively 
prime to d{mQ), which is divisible by p. Also 1 + X]fc>i Cfe+it(mQ)'^ is a unit, again 
because t{mQ) = —d{mQ)a{mQ)/(3{mQ) is divisible by p. 

Lemma 8. Suppose that u G Z* is a unit and that N > 1 is an integer. To 
determine log^ u mod p^ it suffices to know u mod . 

Proof. From log^ u — ^^logp{uP~^) we may assume that u = 1 (modp). Since 
logp(u+£) = logp u+logp{l+e/u), the result amounts to checking that if Vp{e) > N, 
then Wp(logp(l + ej) > N. Using the power series expansion of logp(l + x), this 
follows from the elementary estimate Vp(e^ /n) > VpS, that is, Vp{n) < n — 1 for all 
n>l. □ 

By Lemma m it suffices to compute 

a{mQ) 1 1 ^Nfc 

and 1 + > Ck+it{mQr 
l3{mQ) ^ 

mod p^^ . By hypothesis we have available the sigma function mod Ij^jr^i, which 
means that we know cj mod p*^ +^-3 for 2 < j < M'. Since p divides t{mQ), it 
therefore suffices to compute a{mQ)/(3{mQ) and t{mQ) mod p^ . For this, we 
have the following result, which is proved in t j5. II below. 

Proposition 9. Suppose that Q £ E(Q) reduces to a non- singular point of E(Fi) 
for every prime £ at which E has bad reduction. Let R > 1 be odd, and m > 2. 
Given the values of a{Q), P{Q) and d{Q) mod R, one may compute ±a{mQ), 
(3{mQ) and ±d{mQ) mod R in time 0(logi?logm). 

(The ± symbols indicate that a{mQ) and d(rnQ) will be correct only up to sign, 
and that the signs will agree.) 

Proposition [5] completely avoids the problem of the coordinates of mQ growing 
out of control, since the dependence on m is only logarithmic. 

We apply Proposition [9] with R = p^^ , to determine a{mQ) / f3{mQ) and t{mQ) 
modp^^ . The sign ambiguity is irrelevant, since in i(m(5) = —d(mQ)a{mQ)/ P(mQ) 
the signs cancel out, and the sign of a(mQ) / P(mQ) is not needed since the p-adic 
logarithm is insensitive to the sign of its input. 

Now we analyse the running time. Recall that n2 is assumed to be 0(1), 
so the time for computing Q — is 0{Cp). Applying Proposition [9] costs 

O (log (p*^') log to). We have M' = 0{M) and to = 0(p), so this is 0(M log^p). We 
must substitute t{mQ) into which requires 0{M') ring operations in Z/p^^ Z, 
costing time 0(M'log(p*^ ) = 0(M^ logp). Finally we must compute the p-adic 
logarithm in ([5]). Using the series expansion of logp(l + a;), this requires 0{M') ring 

operations, for a cost of 0(M^ logp); with more effort one can obtain 0(Af logp) 
[BiR §16]. 

5.1. Proof of Proposition [9l Our main tools in the proof are the division poly- 
nomials ipm associated to our choice of Weierstrass equation for E, and a non- 
cancellation result of Wuthrich |Wut04i Prop. IV. 2] that in effect controls the 
amount of cancellation that can occur while computing mQ from Q. For further 
background on division polynomials, including proofs of the assertions we use in 
this section, we refer the reader to |MT91[ Appendix I] and [Lan78[ Ch. II]. 
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The relevance of division polynomials is that they appear in a simple formula 
for the coordinates of mQ in terms of the coordinates of Q. For an integer to > 1, 
and a non-torsion point Q, we have 

(10) <rnQ) y{mQ) ""^^^ 



where 9m,ujm G Q(£') are certain auxiliary functions defined in terms of the ipm- 

The quantities tpm{Q) , 0m{Q) , i^miQ) € Q will generally not be integers, due to 
the coordinates of Q themselves having denominators. It is convenient to introduce 
a normalising factor that absorbs these denominators. Accordingly we set 

Note that ■0^, Qm and oJm are defined only on i?(Q) — they are most definitely 
not rational functions on E. One checks, by examining the degrees of ^p^, 6„i and 
LO„i, that ipmiQ), Qm{Q) and CjmiQ) are all integers. Therefore we now have two 
representations of x{mQ) and y{mQ) as ratios of integers, 



x{mQ) 

(11) 

y{mQ) 



(3{mQ) _ Wm(Q) 



The point of pT|) is twofold. First, Proposition IV. 2 of |Wut04j guarantees that, 
under the hypothesis of Proposition [51 the fractions on the right are in fact reduced 
fractions; in other words, that d{mQ) — ±'tjj„i{Q)d{Q). Therefore we may conclude 
from (ini) that 

dimQ) - ±V>„(Q)d(Q), 
(12) a{mQ)=em{Q), 

f3{mQ) = ±UraiQ), 

where the choices of signs in the first and third equations agree. 

Second, we will show that tpmiQ), Qm{Q) and Lbm{Q) can be efficiently computed 
mod R using the usual recursion formulae for the division polynomials. For our 
application, it is not necessary to compute the division polynomials themselves — 
in fact, to do so would completely miss the point, since their degree grows like 
w? , which is precisely the rate of growth that we are trying to avoid. Rather, we 
need only compute their values at Q, and only mod R. Fortunately the standard 
recursive formulae for division polynomials, with minor modifications, are perfectly 
suited to this task. 

To avoid losing p-adic precision, we give a version of the formulae that involve 
no divisions (apart from one division by 2, which is permitted since we have as- 
sumed that 2 is invertible in R). Also, our formulae are tailored to computing the 
normalised versions ipmiQ), Om(Q) and uimiQ) directly, instead of ipmiQ), Om{Q) 
and cumiQ), so that we can work with integral quantities throughout. 
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In the formulae below, we abbreviate a{Q) by a, and similarly with the other 
variables. We start by defining normalised versions of the coefficients Uk of the 
elliptic curve, setting 

CLk^d^ak, fc = 1,2,3,4,6. 

We next define variables bk and B^, which are normalised versions of the bk and 
Bk appearing in |MT91| : 



b2 


= al + 


4(22, 




= aias 


+ 2a4, 


k 


+ 


4a6, 


h 




+ 4a2a6 - 010304 + 0203 — 


B4 


= 6a^ - 


f 62Q: + 64, 


Be 


= 4a^ - 


f 620:^ + 2640; + Se, 


Bs 


= 3a*- 


f 620:^ + Sha'^ + 3bea + bs- 



Similarly, we need normalised versions of the g„i from [MT91j . defined by 
9o = 0, gi = 1, 52 = -1, 33 ^8, Bl- B^Bg,, 

and then recursively for m > 5 by 

_ \ Blgn+2gl - Qn-igl+i-: n even, 
(13) |^5„+2g;^ - B|g„-ig;^+i, n odd, 

52n = 3n(5ri-25n+l ~ 5n+2ff?i_l)- 

Finally, the values of ipm, Om and tt>„j for m > 2 are given in terms of the gm by 
T = 2f3 + CLia + 03, 
— 'ff"(™+i)^ 

9m = a^A^ - V'm+lV'm-1, 
<^m^^ {f''^"'Hgrn~2gfn+l " 5m+2.9™-l) + i^m{aAn + OsV^m)) , 

where (y{k) is or 1 accordingly as k is even or odd. 



(14) 



The algorithm implementing Proposition[5]now runs as follows. All computations 
are performed mod R. We are given as input a{Q), P{Q) and d{Q)^ and the 
constants Ofc. From (|12p it suffices to compute ipm, and ujm- 

We start by computing all of the dk^ bk and Bfe, and 50 through 34, using the 
formulae given above. Then, using (fT3|) . recursively compute gm-2 through gm+2- 
During this recursive step, it is important to retain the values of gj as they are 
computed, since many of them will be reused. Finally, the equations (fT4|) then 
determine ij^m, Om and dim- 

Now we analyse the complexity. Arithmetic operations in Ti/RTi may be per- 
formed in time 0(logi?), so the crux of the matter is to show that the recursive 
formulae (fT3l) are evaluated at most O(logm) times. 

Let k > A. To determine gj for all j in the range k < j < k + 7 , using (fT3| it 
suffices to know gj for (fc - 3)/2 < j < (fc + ll)/2 if k is odd, or (fc - 4)/2 < j < 
(fc + 10) /2 if k is even. In other words, to determine 8 consecutive values of gj near 
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j — fc, it suffices to know 8 consecutive values of gj near j — k/2. Iterating this 
process, to compute gk one must evaluate (fT3|) at most 81og2(fc) = O(logfc) times. 

Example 10. Let P = (5/4, —3/8) be a generator of -E(Q) where E is the elliptic 
curve y'^ + y = + x'^ ~ 7x + 5. This curve is '91bl' in Cremona's database, and 
has conductor 91 = 7 • 13. One checks directly that P reduces to a non-singular 
point at the bad primes 7 and 13. 

We will illustrate Proposition [9] for the case R = 99, m = 101. The starting 
point P is specified by 

a = 5, /3 = 96, d = 2. 
The various constants are initialised as 



ai 


= 0, 


b2 


= 16, 


Bi 


= 6, 


0.2 


= 4, 


h 


= 73, 


Be 


= 4, 


as 


= 8, 




= 57, 


Bs 


= 67, 


04 


= 86, 


k 


= 59, 


f 


= 2. 


06 


= 23, 











We now list the intermediate results in the computation of a{mP), j3{mP) and 
d{niP) mod R. The required values of cjm are 



30 


= 0, 


56 


= 63, 


512 


= 0, 


523 


= 35, 


548 


= 0, 


599 


= 49, 


.91 


= 1, 


.97 


= 98, 


513 


= 64, 


524 


= 0, 


549 


= 1, 


5100 


= 19, 


32 


= 98, 


58 


= 35, 


514 


= 71, 


.925 


= 91, 


.950 


= 62, 


5101 


= 82, 


ff3 


= 67, 


59 


= 50, 


515 


= 4, 


.926 


= 17, 


551 


= 49, 


5102 


= 72, 


ff4 


= 10, 


510 


= 73, 


516 


= 1, 


527 


= 67, 


552 


= 46, 


5103 


= 98. 


55 


= 37, 


.911 


= 98, 


522 


= 1, 


528 


= 46, 


553 


= 1, 







From these it follows that 

t/'ioo = 38, V^ioi = 82, -0102 = 45, 

and 

0101 = 32, a)ioi = 4. 

Therefore we obtain 

a(lOlP) = 32, 
/3(101P) = ±4, 
d{lQlP) = ±2- 82 = ±65. 
The result may be verified by using a machine to explicitly compute the coordi- 
nates of lOlP; it turns out that the negative sign was the correct one. 

6. A COMPLETE EXAMPLE 

We will illustrate the main algorithm by stepping through a computation of the 
p-adic height of a rational point on the curve 

E: y^ + xy = x^ ~l2x + 16, 

which is '214al' from Cremona's database. It has conductor 214 = 2 • 107, and its 
rank over Q is one, with generator 

P=(0,-4). 
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To make life interesting, we will take p — 43, which is anomalous for E since 
#-E(F43) — 43. We will aim to compute hp{P) to precision , so set M = 6. The 
Tamagawa numbers of i? at 2 and 107 are respectively 7 and 1; therefore we set 

m = #£;(r43) = 43, 

n2 = LCM(7, 1) 7, 

n = LCM(7ii,n2) = 301, 

m = n/n2 — 43, 

M' M + 2u43(n) = 8. 

To apply Theorem [3] we need (7p(t) mod /g; from Theorem [5] this means we 

require E2 modulo p^ . To be able to apply Kedlaya's algorithm, it is convenient to 

change coordinates to put the curve into the Weierstrass form = + ax + b; for 

our curve this turns out to be 

, , .577 14689 

E': = x^ x-\ . 

^ 48 864 

(Note that the equation E' has the same discriminant A = —13696 as our original 

equation, so the value of E2 we compute for E' will be the correct value also for E. 

If instead we chose an equation whose discriminant differed by a factor of u^^, then 

we would need to adjust the computed E2 by u^, since E2 is of weight two. In other 

words, we need to take into account the choice of invariant differential implied by 

the choice of equation, as E2{E, oj) is a function of both E and oj.) 

We apply Kedlaya's algorithm — omitting the details — to find that the Frobe- 

nius matrix is 

A996923274 3651910366\ , 6^ 
1^1002107518 1324439776/ l^od p j. 

Then 

6 _ /3987851820 483786047l\ , 6^ 
^1^1528699020 2333368599/ 

so 

4837860471 ^ e 

2333368599 ^ ^ ' 

Using E2 as input. Theorem [5] then yields the sigma function, 

ap{t) = t + (135909305554+ 0(/))t2 + (3933286396 + 0(/))t3 

+ (129848206+ 0(/))t^ + (2650487+ 0(p'*))t^ 

+ (77893 + 0(p^))t^ + (1561 + 0{p^))t^ + (8 + 0{p))t^ + 0{t^). 

(See Example [7] for a more detailed example of computing ap{t).) 

Now we bring into the picture the particular P whose height is sought. We 
compute 

/3 -25' 



A 8 

so that a{Q) = 3, P{Q) — —25, and d{Q) — 2. Using Proposition [9l we then find 
that, mod p^ , 

a{mQ) = 9491762277279, 
/3(mg) = ±10171094217691, 
d{mQ) = ±3360349669562. 
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(See Example [TOl for a more detailed example of this step.) Substituting everything 
into (HI) and we find that 

hp{P) - ^ logp(1430987165464+ 0(43')) 

= y2 ^^32 (43 • 44668563676+0(43*^)) 
= 43^1 ■ 96127622779+0(43^), 

to precision as desired. 

7. Sample computations 

The author implemented the above algorithms in the computer algebra system 
SAGE |SJ05| , building on an implementation of the algorithm of [MST06| by Robert 
Bradshaw, Jennifer Balakrishnan, Liang Xiao, and the author. The code is freely 
available under a GPL license, and is distributed as a standard component of SAGE 
(version 2.2 and later). An example session: 

sage: E = EllipticCurve("37a") ; E 

Elliptic Curve defined by j~2 + y = x~3 - x over Rational Field 
sage: P = E.gens()[0]; P # a generator of E(Q) 

(0:0:1) 
sage: h = E.padic_height(p=5, prec=5) 
sage: h(P) 

4*5 + 3*5~2 + 3*5~3 + 4*5'4 + D(5~5) 
The examples below were run on an AMD Opteron running at 1.8 GHz, kindly 
provided by William Stein, funded by his NSF grant #0555776. For these examples, 
we did not use the methods of §3.H opting instead to use the trace and determinant 
as a correctness check. 

7.1. Large prime case. Wc will take the elliptic curve '92bl' from Cremona's 
database, which has equation — x'^ — x + 1 and conductor 92 = 2^ • 23. A 
generator of the group of rational points is P = {x,y) = (1, 1). 

Let p = 10^^ + 3 and M = 6. We need to compute E2 mod p"^. Using the 
method of [Har07| , SAGE finds that the matrix of Frobenius on the standard basis 
for Monsky-Washnitzer cohomology is given by 

64304585760876115698175680344198318130013083 42503972380936025561124602310186734870477220 
97128558385368210540568141789457735335273547 35695414251123884302364319655812481869943749 

mod p^. The computation time was 42 minutes. As a consistency check, one may 
verify that the determinant of this matrix is p (mod p'^), and that the trace is 

100000000012000000000540000000010799999956832 ee -43249 (mod /), 

which agrees with other point-counting algorithms. 
From the matrix SAGE finds immediately that 

E2 = 74470168280485533213508423470741122284560152+ 0(/), 

and that the p-adic sigma function is 

ap{t) = t+ (69769590353020230550922850977954746761856727+ 0(/))t3 

+ (5214659177355434704657+ 0(/))t^ + 0{f). 
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Finally, SAGE uses Theorem [3] to find that hp{P) is 

p ■ 9226324270539878944369124959203473806055293044599072658+ 0(p'^). 

This last step is virtually instantaneous. Observe that the original algorithm of 
[MST06j would have required computing the coordinates of the point nP where 
n « 10^^, which would occupy around 10^^ bits of storage — not to mention the 
computation time. Clearly, Proposition [§] is essential for handling such large p. 

7.2. High precision case. Continuing with the same curve, we now consider the 
case p = 5 and M = 3000. Therefore we require E2 modulo 5^^^^. Since p is small 
relative to M, SAGE selects Kedlaya's original algorithm [KedOlj for the Frobenius 
matrix computation. The first and last few digits of E2 are 

E2 = 3 + 2 . 5 + 2 • 53 + 3 ■ 55 + 2 • 5^ + ■ • • + 3 • 5^995 + 3 . 52996 ^ o(52998)_ 

The computation time to obtain E2 was 229 seconds. 

Obtaining the p-adic sigma function took 158 seconds. There are about 3000 
coefficients, each with a similar number of 5-adic digits, so we dare not write it 
down here. For such high precision computations, the original algorithm of |MST06| 
would have been utterly impractical; the contribution from the initial power 
series expansions would multiply the above running time by a factor of perhaps 
10^. 

Finally, computing hp{P) itself from the sigma function took about 6 seconds. 
It turns out to be 

hp{P) = 3 • 5 + 3 • 52 + 2 • 53 + 54 + . . . + 4 • 52998 + 2 • 52999 + 0(53000). 

References 

[Ber] Daniel Bernstein, Fast multiplication and its applications, 

http://cr.yp.to/lineartime/multapps-20041007.pdf, retrieved 4th March 2007. 

[BK78] R. P. Brent and H. T. Kung, Fast algorithms for manipulating formal power series, J. 
Assoc. Comput. Mach. 25 (1978), no. 4, 581-595. 

[Bre76] Richard P. Brent, Multiple-precision zero-finding methods and the complexity of elemen- 
tary function evaluation. Analytic computational complexity (Proc. Sympos., Carnegie- 
Mellon Univ., Pittsburgh, Pa., 1975), Academic Press, New York, 1976, pp. 151-176. 

[Har07] David Harvey, Kedlaya's algorithm in larger characteristic, to appear in Int. Math. Res. 
Not., arXiv preprint math.NT/0610973v2, 2007. 

[KedOl] Kiran S. Kedlaya, Counting points on hyperelliptic curves using Monsky-Washnitzer 
cohomology, J. Ramanujan Math. Soc. 16 (2001), no. 4, 323—338. 

[Lan78] Serge Lang, Elliptic curves: Diophantine analysis, Grundlehren der Mathematischen 
Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 231, Springer- 
Verlag, Berlin, 1978. 

[MST06] B. Mazur, W. Stein, and J. Tate, Computation of p-adic heights and log convergence, 
Documenta Math. (Extra Volume: John H. Coates' Sixtieth Birthday) (2006), 577-614. 

[MT91] B. Mazur and J. Tate, The p-adic sigma function, Duke Math. J. 62 (1991), no. 3, 
663-688. 

[Sil92] Joseph H. Silverman, The arithmetic of elliptic curves. Graduate Texts in Mathematics, 
vol. 106, Springer- Verlag, New York, 1992, Corrected reprint of the 1986 original. 

[SJ05] William Stein and David Joyner, Sage: System for algebra and geometry experimenta- 
tion, Communications in Computer Algebra (ACM SIGSAM Bulletin) 39 (2005), no. 2, 
61-64. 

[Wut04] Christian Wuthrich, The fine selmer group and height pairings, Ph.D. thesis, Cambridge, 
2004. 



